One of Robert Glaser's special contributions to psychology and education is the concept of
criterion-referenced testing (Glaser, 1963). While norm-referenced testing supports decisions that
involve choosing among people or otherwise comparing them, criterion-referenced tests tell us
something  about  what  people  know  or  what  they  can  do.  In  introducing  the  concept,  Glaser  was 
beginning  a  long  advocacy  of  adaptive  education,  of  shaping  education  to  each  person's  current 
competences  rather  than  choosing  to  educate  only  the  people  who  score  highest  on  general  tests. 

While this was his goal, most work on criterion-referenced testing (cf. Hambleton, 1984) has
focused  on  issues  relating  to  certification,  to  setting  of  standards  for  educational  outcomes,  and  to 
tracking,  that  is,  on  selection  more  than  on  adaptation.  There  are  a  number  of  reasons  for  this,  but  the 
situation can be summarized as follows. Adaptive education is a steering process. Norm-referenced
tests  are  designed  to  indicate  reliably  who  is  out  in  front;  criterion-referenced  tests  are  designed  to  tell 
us  exactly  where  each  person  is;  but  knowing  where  you  are  is  not  the  same  as  knowing  how  to  steer  a 
course  toward  a  planned  destination. 

The  purpose  of  this  chapter  is  to  illustrate  one  way  in  which  the  technologies  of  testing  might 
combine  with  certain  cognitive  science  techniques  to  help  steer  instruction.  We  focus  on  steering  an 
intelligent  tutor,  i.e.,  on  student  modeling.  However,  the  approach  can  be  generalized  to  other 
instructional  forms,  including  reactive  environments  (exploratory  microworlds)  and  perhaps  even 
conventional  classroom  instruction.  We  are  discussing  diagnostic  testing  to  be  used  often,  in  small 
amounts, to steer the course of instruction. Further, in contrast to relatively standard (e.g.,
pretest-treatment-posttest)  designs  for  individualizing  the  teaching  of  children,  we  focus  on 
individualizing  the  testing  process  to  make  it  more  efficient  in  steering  instruction. 

Problems  of  Diagnostic  Testing 

Any  test,  including  a  diagnostic  test,  consists  of  a  number  of  items.  The  person  being  tested 
carries  out  some  performance  of  each  of  the  items,  scores  are  assigned  to  those  performances,  and  those 
scores  are  aggregated  to  arrive  at  an  evaluation.  To  make  steering  tests,  we  need  test  items  that  are 
relevant  to  the  specific  steering  decisions  that  must  be  made  about  a  particular  student  in  a  particular 
context,  and  we  need  procedures  for  scoring  performance  on  those  items.  Steering  tests  must  be 
efficient to administer, since steering requires frequent, but not necessarily precise, feedback (given
the  inertia  of  teaching  and  learning,  the  steering  error  produced  by  believing  an  imprecise  test  will 
probably  be  canceled  out  by  subsequent  course  corrections). 

Standard  psychometric  methods  are  not  designed  for  steering  tests.  They  are  designed  to  assure 
that  different  forms  of  a  test  are  equivalent  and  that  the  scores  on  that  test  are  reliable.  The  problem 
of steering tests is that they must be brief, so that testing does not take too much time from learning.
This makes it difficult for them to be reliable, and steering requires at least some reliability of
feedback  to  be  successful. 

There are two ways a test can be made more reliable. The first is to increase the extent to which
performance on its items directly reflects the skills one wishes to assess. This can be done statistically
or substantively. Statistical approaches such as item-response theory (Lord, 1980) help assure that
different items are measuring the same thing, and thereby increase the reliability of scores, but not
necessarily their validity. However, it is also possible to develop a microtheory of the competences one
wishes  to  teach.  Such  a  microtheory  can  help  in  specifying  items  that  test  particular  subsets  of  the 
target  skills. 

The  second  way  to  make  a  test  more  reliable  is  to  use  knowledge  about  the  student's  performance 
on prior items to limit the information each new test item must provide. Adaptive testing algorithms
have been developed for this purpose. They use a sequential strategy. After the student completes an
item, an estimate of the student's performance based upon the items so far completed is used to select
the most informative next item to administer, and then the score on that next item is used to update
the estimate. The adaptive testing approach, which almost always requires a computer for the
real-time estimates just mentioned, can be applied even when nothing more than the difficulty
ordering  of  items  is  known.  However,  it  is  especially  powerful  if  more  detailed  information  about  the 
items  is  available.  Again,  a  theory  that  relates  performance  on  various  test  items  to  underlying 
competences  and  their  acquisition  can  be  helpful,  even  if  it  is  incomplete. 
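
The sequential strategy just described can be illustrated with a minimal sketch for the simplest case, in which only the difficulty ordering of items is known. The function name `adaptive_test`, the `answer` callback, and the assumption that a student passes every item below some level and fails every item above it are all illustrative simplifications, not part of any system described in this chapter.

```python
from typing import Callable, List, Tuple


def adaptive_test(difficulties: List[float],
                  answer: Callable[[int], bool]) -> Tuple[int, List[int]]:
    """Locate a student on a difficulty-ordered list of items.

    difficulties: item difficulties in ascending order (only the order is used).
    answer: hypothetical callback; answer(i) is True if the student passes item i.
    Returns (level, administered), where level is the estimated number of
    items the student can pass and administered lists the items given.
    """
    lo, hi = 0, len(difficulties)    # the estimate lies somewhere in [lo, hi]
    administered: List[int] = []
    while lo < hi:
        mid = (lo + hi) // 2         # most informative item: splits the range
        administered.append(mid)
        if answer(mid):
            lo = mid + 1             # passed: the level is above this item
        else:
            hi = mid                 # failed: the level is at or below this item
    return lo, administered


# A simulated student who can pass the five easiest of sixteen items.
level, given = adaptive_test(list(range(16)), lambda i: i < 5)
print(level, given)
```

Under these idealized assumptions the student is located after four items rather than sixteen. Real responses are noisy, which is why practical adaptive tests replace the hard interval with a statistical estimate.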

In at least one case, adaptive testing techniques were applied to diagnostic testing (Spineti &
Hambleton,  1977).  Spineti  and  Hambleton  used  learning  hierarchies  specified  by  rational  task 
analysis  (Gagne,  1965)  to  help  constrain  the  estimation  process.  That  is,  they  decided  on  items 
according  to  an  analysis  of  the  material  being  learned  and  to  some  theoretical  predictions  of  the  order 
of  acquisition  for  parts  of  that  material.  Doing  this,  they  were  able  to  achieve  a  50%  reduction  in  the 
number of items required to achieve a given level of score reliability.

The  approach  we  have  taken  to  steering  testing  is  somewhat  different.  It  uses  very  simple 
heuristics  for  reasoning  about  the  level  of  a  student's  competence  in  particular  subskills.  Its  power 
derives  primarily  from  its  ability  to  intelligently  manufacture  practice  opportunities  (test  items)  for 
the student that will be especially revealing about his current competences. We believe, although it
remains  to  be  proven,  that  these  practice  opportunities  are  generally  appropriate  learning  vehicles  as 
well  as  test  items.  In  that  sense,  we  are  pursuing  steering  as  a  unified  system  in  which  testing  and 
learning  are  combined. 

In  our  view,  a  cognitive  theory  of  testing,  and  especially  a  theory  of  steering  testing,  should  have 
two  characteristics.  First,  it  should  permit  a  partly  logical  (in  contrast  to  a  purely  statistical) 
constraining  of  diagnosis.  Second,  it  should  be  based  on  a  representation  of  the  knowledge  that  is 
needed  to  exercise  the  skill  it  purports  to  measure.  The  logical  approach  is  not  at  all  foreign  to  our 
experience.  When  one  is  sick  and  goes  to  a  physician,  one  is  not  satisfied  with  broad  probabilistic 
statements. Rather, one expects a diagnosis constrained by the physician's knowledge of disease.
More specifically, we expect the physician to be asking herself what diseases could produce the overall
complex of symptoms and signs presenting themselves to her. Diagnosis in medicine, then, is the
designing of a personal theory of a specific patient's pathology. This personal theory is rooted in
theories of disease mechanisms and not just in unexplained statistical relationships.

The  diagnosis  process  is  dynamic.  For  example,  based  on  the  hypothesis  that  a  patient  has  heart 
disease,  the  physician  may  probe  for  more  explicit  detail  about  certain  symptoms  or  order  a  test  that 
may  confirm  or  refute  her  theory.  A  teacher  does  this  too  when  prior  knowledge  about  a  student, 
combined  with  current  observations,  leads  her  to  attribute  grammatical  errors  in  the  student's  paper 
either  to  inexperience  with  written  language  or  to  use  of  nonstandard  dialect  or  to  a  mistaken  sense  of 
when  formal  conventions  are  needed. 

The good teacher's diagnosis differs from that of a physician in one respect, though. We come to a
physician to get a diagnosis when something is wrong -- she does not generally shape continuing
decisions about how we should act (except perhaps in developing special regimens, e.g., diets for
control of diabetes). A teacher, in contrast, is carrying out an active, goal-directed activity -- teaching
-- which needs only small course corrections. Consequently, it seems reasonable to conduct the testing
from  the  teacher's  point  of  view,  at  least  in  part. 

We  would  like  to  produce  tests  that  capture  some  of  the  capabilities  of  the  most  perceptive  and 
observant  teachers.  We  want  them  to  be  driven  mostly  from  the  teacher's  goal  structures  for  teaching 
but  also  to  respond  to  knowledge  of  the  expertise  the  teacher  is  trying  to  convey,  the  treatments 
available  to  the  teacher  for  effecting  learning,  and  certain  more  global  teacher  concerns,  such  as 
adapting  to  general  differences  in  aptitude  and  general  characteristics  of  competence  at  different 
levels  of  learning. 

In  the  next  section,  we  discuss  the  different  kinds  of  knowledge  that  are  needed  to  adapt  teaching 
to an individual student's course of learning. We take the viewpoint of intelligent tutoring system
design,  but  the  same  concerns  arise  in  all  approaches  to  instruction.  This  is  followed  by  sections  in 
which  a  specific  approach  to  the  generation  of  diagnostically  and  educationally  useful  problems  is 
discussed. 

Components  of  Teaching  and  Testing  Knowledge 

Several different kinds of knowledge are required in our approach to steering testing. Especially
when designing computer systems to teach or to test, it is important to clarify the knowledge, or
competence, that is involved in dealing with a student. We have categorized that knowledge into four
types.1 These are domain expertise, curriculum knowledge, instructional planning knowledge, and
treatment knowledge. Each type of knowledge has different structures, different generalized methods,
and different purpose and applicability. Further, there are a variety of connections from one type of
knowledge  to  another.  Figure  1  shows  these  four  categories  with  examples  of  the  kinds  of  knowledge 
they  contain  for  an  electricity  tutoring/testing  system  under  development  at  the  Learning  Research 
and  Development  Center. 

Domain  Expertise 

Domain  expertise  is  always  embodied  in  instructional  decision  making,  either  explicitly  or 
implicitly.  Deep  diagnosis  of  student  difficulties  may  require  an  explicit  representation  of  the 
knowledge  required  for  the  performances  that  are  the  goals  of  instruction.  For  example,  the  ability  of 
a  computer-based  tutor  to  diagnose  bugs  (systematic  errors)  in  children's  arithmetic  performances 
requires  having  a  model  of  the  algorithms  that  experts  use  in  executing  those  performances.  Also, 
feedback  on  test  performance  and  advice  to  the  student  may  have  to  be  couched  in  terms  of  procedures 
for  acting  rather  than  in  terms  of  criteria  for  outcomes  specified  in  the  curriculum.  One  way  or 
another,  the  performances  that  constitute  the  goals  of  a  curriculum  derive  from  information  about  the 
competences  that  constitute  expertise. 

Another  aspect  of  domain  expertise  that  is  important  in  instruction  and  testing  is  knowledge  of 
the target task environment. When we speak of what it is we want people to do, we are referring not
only  to  the  knowledge  they  need  to  perform  successfully  but  also  to  the  circumstances  under  which 
that  knowledge  must  be  employed.  Again,  knowledge  of  these  circumstances  might  be  the  basis  for 
curricular  objectives,  but  those  objectives  rest  upon  domain  expertise.  If  we  have  the  objective  that 
given  situation  X,  the  student  can  do  Y,  it  rests  upon  knowledge  of  what  kind  of  situation  X  is  and  how 
Y  can  be  done  in  X.  For  example,  a  student  might  be  able  to  solve  a  proportion  problem  at  the  time  a 
lesson  on  proportion  is  presented  but  not  be  able  to  use  that  knowledge  later  in  solving  a  word  problem 
or  even  to  solve  the  same  problem  as  one  of  a  set  on  mixed  topics.  When  testing  or  teaching  is  done  by 
a  computer  program,  the  underlying  domain  knowledge  sometimes  must  be  made  explicit. 

Curriculum  Knowledge 

Curriculum  knowledge  is  the  specification  of  the  goal  structure  that  guides  the  teaching  of  a  body 
of  expertise.  Educational  researchers  and  developers  often  treat  the  procedures  that  constitute 
expertise  and  the  instructional  goals  that  constitute  curriculum  as  more  or  less  the  same.  They 
assume  that  expertise  can  be  split  apart  easily  "at  its  joints"  (to  use  Plato’s  phrase).  The  curriculum, 
then,  is  a  natural  hierarchy  of  goals  and  subgoals  to  teach  the  natural  units  of  expertise.  From  this 
viewpoint, curriculum knowledge and domain expertise are the same thing. However, it appears that
there are many different plans for splitting apart expertise, especially when expertise involves
complex performances. For example, consider the curricular issues that arise in teaching simple
electrical principles. There are some basic concepts -- voltage, current, and resistance -- and some
basic laws -- Kirchhoff's Laws and Ohm's Law. In addition, there are different types of circuits -- series
and  parallel. 

So,  one  legitimate  decomposition  of  the  subject  might  begin  with  voltage,  teaching  the  behavior 
of  voltage  in  series  and  parallel  circuits,  then  teaching  about  resistance  in  the  two  types  of  circuits, 
and  finally  treating  current.  Another  decomposition  might,  with  equal  legitimacy,  build  the  entire 
curriculum on Kirchhoff's current laws. Yet another view might treat parallel circuits as being quite
distinct  from  series  circuits  and  redevelop  the  concepts  of  voltage,  resistance  and  current  separately 
for  each.  We  need  to  capture  these  multiple  viewpoints  if  they  correspond  to  different  curricular  goals 
about  which  steering  information  may  be  needed.  For  this  reason,  the  various  subgoals  of  knowledge 
that  the  teacher  or  curriculum  writer  can  have  are  best  represented  by  multiple  hierarchical  goal 
structures; these goal structures overlap in the components of expert performance to which they refer.

Once  we  concede  that  instructional  goals  are  not  really  a  simple  decomposition  of  the  expertise 
being  taught  into  discrete  sets  and  subsets,  we  are  in  a  position  to  understand  why  some  testing  that  is 
part of a curriculum may not be as diagnostic as we would hope. Specifically, we can understand why a
student  might  demonstrate  clear  competence  on  a  curricular  goal  that  is  prerequisite  to  some  other 
goal  but  still  appear,  from  the  standpoint  of  the  teacher  of  that  second  goal,  to  not  have  mastered  the 
first. For example, a student may demonstrate understanding of Kirchhoff's Current Law but fail to
apply it in a circumstance for which it is relevant. Separating expertise from curriculum allows us to
understand such situations better.

Suppose that we consider domain expertise to be represented by a surface. Expert knowledge is,
after  all,  highly  interconnected.  Even  if  it  is  properly  represented  as  some  kind  of  network,  it  can  be 
approximated  by  a  continuous  surface  (specifically,  a  manifold  of  unspecified  dimensionality).  We 
start  by  assuming  that  each  curricular  subgoal  corresponds  to  a  region  of  the  expertise  continuum. 

The  expertise  subset  corresponding  to  a  curricular  goal  will  likely  be  convex,  in  the  sense  that  if  two 
pieces  of  knowledge  are  part  of  the  same  curricular  goal,  then  any  strong  relationship  that  directly  ties 
them together should also be part of that goal. On the other hand, a curriculum goal's corresponding
expertise  is  not  a  completely  closed  set,  since  concepts  it  subsumes  may  well  have  connections  to  other 
knowledge  that  goes  beyond  the  goal.  That  is,  the  edges  between  the  expertise  subsets  corresponding 
to  different  curricular  subgoals  are  not  necessarily  clean  edges  with  no  connections  to  other 
knowledge. 

The  untargeted  knowledge  lying  between  the  clusters  of  expertise  directly  addressed  by  the 
curriculum  can  be  important  in  remediating  lack  of  transfer  from  a  curriculum  goal's  prerequisites  to 
the final target capability.2 Ordinarily, instruction is directed at the center of the expertise subset
corresponding to a curricular goal (see Figure 2). This helps keep the new knowledge to be taught
simple enough to be learned. However, this approach can sometimes backfire. For example, if two
bundles of expertise are both curricular goals, their centers may be well taught but their peripheries
ignored. For example, I may teach you how to compute the joint resistance of two resistors in series,
and this may satisfy an instructional objective. Later, if you need to find the joint resistance of three
resistors in order to solve a problem, you may be able to do that or you may not. In either case, simply
reteaching  the  two-resistor  algorithm  will  be  insufficient. 

If  a  higher-order  curricular  goal  happens  to  depend  upon  the  integration  of  the  two  lower  order 
subgoals, it is exactly the edges of their domain knowledge subsets on which it will likely depend. For
decisions  about  what  to  teach  when  remediation  seems  necessary  and  also  for  decisions  about  how  to 
interpret  apparent  inconsistencies  in  diagnosing  whether  a  curricular  subgoal  has  been  achieved, 
domain  expertise  may  be  needed. 

Planning  Knowledge 

In  addition  to  specific  curricular  goals,  there  are  some  other  higher-order  curricular  issues  that 
need  to  be  addressed  in  planning  testing  or  teaching.  Often,  these  are  abstractions  from,  or  specialized 
viewpoints  on,  the  curricular  goal  structure.  These  may  include  learning  skills,  problem  solving 
heuristics, rather general aptitudes, and even preferences. These concerns, e.g., the more general
"inquiry" skill goals in a science course, overlap some of the higher-level goals in the curriculum. It
could even be argued that these concerns really are part of the curriculum, but we retain the
distinction since planning issues often color the exact form that goal-specific instruction might take.

For  example,  we  would  treat  as  a  planning  issue  the  complexity  of  arithmetic  computation  that  is 
required  to  solve  a  word  problem  in  a  math  course.  The  metagoal  is  for  the  student  to  be  able  to 
advance  through  the  problem-solving  part  of  the  curriculum  even  if  his  arithmetic  skills  are 
developing more slowly than his problem solving skills. So, the arithmetic required in a word problem
might be adjusted to keep it simple enough to let new problem solving skills develop. Later, when
problem solving skills are strong, the situation might reverse, and increasingly tough arithmetic
might be required whenever the student is predicted to find the problem solving tasks easy. Note that
the issue of arithmetic skills getting in the way of problem solving could arise in curricula other than
math, such as the electrical networks curriculum sketched in Figures 1 and 3. It is for this reason
especially that we choose to treat the matter as a metacurricular planning issue. Sometimes
capability on skills that are not the focus of instruction will require alteration of instructional and
testing strategies for target skills. This is why instruction and testing systems need planning and
metacurricular knowledge.

The  planning  of  teaching  must  also  take  into  account  the  long-term,  higher  order  aspects  of 
education: metacognitive skills, mature and flexible preferences, and fundamental principles that
apply in many domains. From the point of view of the steering test developer, though, these
higher-order issues represent, for the most part, variables to be controlled. We can't really understand
whether a student knows how to solve electrical network problems, for example, if his capability is
hidden by slow arithmetic performance. So, we have to take account of metacurricular issues in
selecting problems for instructional or measurement use. That is, problems can be selected to require
domain-specific skills but to assure that the student answering a given problem will not be troubled by
weakness on general basic skills that are not the current focus of measurement or instruction. For
example, if a student is weak in arithmetic, a problem might be generated that required only
small-integer arithmetic. If a different student finds it easier to receive information in graphical form,
the information given for a problem might be presented via a diagram, graph, or even photographic
image.
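
As a rough illustration of how a metacurricular concern can shape a generated problem, the sketch below produces a series-resistance item whose arithmetic load depends on the student's arithmetic skill. The names `SeriesProblem` and `make_series_problem`, and the specific policy (single-digit values for students weak in arithmetic, values up to three digits otherwise), are assumptions made for this example rather than details of the tutor under development.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class SeriesProblem:
    resistances: List[int]  # resistor values in ohms
    total: int              # expected answer: joint series resistance in ohms


def make_series_problem(n_resistors: int, weak_arithmetic: bool,
                        rng: random.Random) -> SeriesProblem:
    """Generate a series-resistance item with a controlled arithmetic load."""
    # Assumed policy: single-digit values when arithmetic is weak,
    # otherwise values of up to three digits.
    upper = 9 if weak_arithmetic else 999
    resistances = [rng.randint(1, upper) for _ in range(n_resistors)]
    # Resistances in series simply add.
    return SeriesProblem(resistances, sum(resistances))


easy = make_series_problem(3, weak_arithmetic=True, rng=random.Random(0))
print(easy)
```

The domain skill being exercised (adding resistances in series) is the same in both cases; only the incidental arithmetic demand changes.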

Treatment Knowledge

We turn now to the matter of educational treatments and test item development. Even when we
know what to teach or what to measure, there remains a separate form of expertise involved in
successfully generating a situation in which a piece of knowledge can be exercised. For example,
several different types of problems can be created to test understanding of electrical network
principles (or to provide opportunities for coached practice). Problems can be quantitative or
qualitative. They can deal with unchanging situations or can focus on relative changes in different
measurements of a circuit. Since electricity knowledge must be applied in slightly different ways for
each type of problem, we could treat problem type as a curriculum issue. However, the knowledge an
intelligent system needs about problem categories is different in form from knowledge about
curricular goals. This is especially the case when we want to develop problems for practice or for
steering tests that require integrated use of several different skill components that are separate
curricular goals. The knowledge needed to develop such problems is specific to electricity and to the
teaching of electricity.

Practice and testing that requires multiple skills to be combined is an important goal of our work.
A contrasting approach is taken in some formal instructional development methodologies such as the
Defense Department's ISD (Merrill & Tennyson, 1977) approach. As generally used, that approach
consists of complete development and elaboration of the curriculum followed by the development of
tests and treatments corresponding to each curricular goal. This seems entirely sensible, an extension
of a management-by-objectives approach. However, if this method is applied superficially, difficulties
can arise. We have already discussed the problem of too-narrow focusing on core concepts without
adequate elaboration and qualification, but there are other, related problems as well. For example, a
variety of apprenticeship situations involve simultaneous practice of a wide range of skill components,
only some of which may be the current targets of instruction. When practice is provided on each skill
component separately, without attention to when each should be used and how they tie together,
fragmentary learning results. The instructor can show, on academic-style tests, that the student
learned each subskill that was to be taught, but the subskills cannot be put together to solve
real-world problems.

This, of course, is a viewpoint that has been taken before. In the world of reading instruction, for
example, we have just seen a long period in which holistic approaches have been taken. Similarly,
case study approaches to the teaching of medicine and business are driven by the same motivation.
There is, of course, some evidence against holistic approaches. For example, Chall (1967) surveyed a
number of reading curricula and found that, on average, weaker students benefited from a phonics
approach, in which recognition of each individual grapheme was the focus of separate instruction. In
the professional world, it is regularly asked how we can be sure that a student who took a case study
course really learned everything he should have: "What if I get a disease that was not one of the cases
discussed?"

We  can  be  a  bit  more  formal  about  this  problem  if  we  view  subskills  as  productions,  actions  to  be 
performed under specific conditions. When subskills are taught in isolation, the conditions under
which they should apply cannot be specified, since those conditions relate to the broader context of
holistic performances. Also, there may be specific productions that are not represented as subgoals for
instruction  but  that  are  the  "glue"  needed  to  combine  the  productions  that  were  direct  curricular 
targets. 

An  instructional  synthesis  of  the  holistic  and  componential  approaches  requires  several  things, 
including an understanding of the circumstances under which new subskills or concepts should be
introduced in isolation even if they are later to be practiced more holistically. Of course, the missing
productions, the "glue" that holds together the subskills we target in our curriculum, cannot be taught
adequately in vitro; they require holistic instruction. The dilemma is that they also need to be
assessed. We may need to help students attend to "gluing" their fragmentary knowledge together if
they have trouble doing so on their own. Further, we may not always choose to introduce new pieces of
knowledge formally and explicitly, hoping that they will be inferred through rich domain experience.
If we take this approach, which may be very efficient, we need to be able to assess later whether there
are any subgoals that were not well attained.

The  basic  approach  we  have  taken  is  to  generate  test  items  (and  instructional  treatments,  for 
that matter) in the course of testing. That is, at any given point in the course of testing, if a question
arises about a specific curricular goal, a test item is generated for it by an intelligent subsystem of the
tutoring program we (primarily the second author) are developing. The item can be shaped by
metacurricular considerations. Further, if multiple skills are required for any realistic performance
within  the  domain,  sets  of  items  can  be  developed  over  which  particular  subskill  requirements  are 
systematically  varied. 

So, our approach, given a family of cognitive analyses (of expertise, metacurricular issues, and
problem  environments  in  which  the  expertise  can  be  manifested  or  practiced),  is  to  intelligently 
generate  the  equivalent  of  a  controlled  experiment  in  which  the  need  for  various  target  pieces  of 
knowledge  is  systematically  varied.  If  the  student  fails  to  perform  items  requiring  a  piece  of 
knowledge but does perform other items that do not require it, then we infer that work is needed on
that  knowledge.  Further,  we  ask  only  about  pieces  of  knowledge  that  are  in  the  part  of  the  curriculum 
through  which  we  are  steering.  Finally,  rather  than  make  statistical  decisions  about  whether  a  piece 
of  knowledge  is  present  or  absent,  we  assume  that  knowledge  can  be  present  at  various  strength  levels 
and  use  experience  about  the  reliability  with  which  a  particular  piece  of  knowledge  manifests  itself  to 
specify the level of learning of that knowledge.
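
The inference rule in this paragraph can be sketched as follows. Each test item is recorded with the set of knowledge pieces it requires and whether the student performed it. The function `needs_work` and the strict reading used here (a piece is flagged only if every item requiring it was failed and every item not requiring it was performed) are assumptions for illustration; the approach described above grades knowledge by strength rather than making all-or-none decisions.

```python
from typing import FrozenSet, List, Set, Tuple

# One observation: (knowledge pieces the item requires, whether it was performed).
Result = Tuple[FrozenSet[str], bool]


def needs_work(results: List[Result]) -> Set[str]:
    """Flag pieces whose presence in an item separates failures from successes."""
    pieces: Set[str] = set()
    for required, _ in results:
        pieces |= required
    flagged: Set[str] = set()
    for piece in pieces:
        with_piece = [ok for req, ok in results if piece in req]
        without_piece = [ok for req, ok in results if piece not in req]
        # Strict (assumed) reading of the rule: all items requiring the piece
        # were failed, and all items not requiring it were performed.
        if with_piece and without_piece \
                and not any(with_piece) and all(without_piece):
            flagged.add(piece)
    return flagged


# A small "controlled experiment": two laws crossed with two circuit types.
observations = [
    (frozenset({"ohm", "series"}), True),
    (frozenset({"ohm", "parallel"}), False),
    (frozenset({"kirchhoff", "series"}), True),
    (frozenset({"kirchhoff", "parallel"}), False),
]
print(needs_work(observations))
```

Because the requirement for each piece is varied systematically across items, the failures in this example can be attributed to parallel circuits rather than to either law.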

Summary. Perhaps the best way to illustrate the ideas just presented is to refer back to the
example given above. Figure 3 elaborates the knowledge categories, in part, for our system to teach
and test basic electricity principles. The curriculum knowledge includes three sets of goals: laws,
concepts, and architectures. Under each of these are subgoals. For example, the architectures being
considered are series and parallel circuits (i.e., no bridge circuits). The planning knowledge includes
two sets of planning concerns: the arithmetic difficulty of problems that are presented to the student
and the circuit complexity. Both apply with respect to a variety of curricular subgoals. For example,
circuit complexity may affect whether a student can handle parallel circuits, whether he can apply
Kirchhoff's current law, etc. Arithmetic difficulty could also affect these subgoals, especially if
quantitative problems are presented to the student. The treatment knowledge includes information
on problem formats and feedback to the student. Finally, the domain expertise contains specific
details of expertise in handling electrical networks that are referenced by the curriculum
specification.

Generating  Test  Items  from  a  Student  Model 

Having  described  the  architecture  of  the  knowledge  in  a  steering  testing  system,  we  turn  now  to 
how  one  uses  that  knowledge  to  do  assessment  driven  by  a  cognitive  model  of  the  target  capabilities 
being  taught.  We  offer  as  a  first  approximation  an  approach  that  has  been  tested  in  prototype  form  in 
an intelligent tutor. It assumes additional knowledge that we have not yet discussed: a student
model, some sort of knowledge structure specifying which subskills the student is thought to know and
which ones not.

We currently specify the student model by embedding it in the curricular goal structure of an
intelligent tutor. For each curricular subgoal, there must be some sort of notation about the student's
assumed competence relative to that subgoal. In one tutor the first author and his colleagues are
building (Lesgold, Lajoie, et al., 1986), there are only four notations: unlearned, perhaps acquired,
probably acquired, and reliably strong. These notations relate to an underlying cognitive model of
learning derived from John Anderson's (1983) work. The rules currently used to change a subskill
notation from one state to another are quite rough, but they are principled.


Movement to the probably learned state implies that a correct production, or set of productions, is
assumed to have been developed by the student. The perhaps state indicates that the student has been
observed to perform the target skill component, but that there is insufficient evidence to conclude that
he knows the conditions as well as the actions for the subskill. The perhaps state is unstable. Either
further correct performances will occur, prompting classification to the probably state, or we will
assume that the single correct performance observed was accidental relative to the problem ecology for
the curriculum, and the student will be moved back to the unlearned state. Recurrent reliable
performance will move a student from probably to strong. One can imagine other approaches in which
the notations might include indicators of misconceptions as well. The important point is that if we
look in on a student who is in the midst of learning a skill, some of the subskills will be clearly
demonstrated already, some will be manifesting obvious problems, some will be unlearned, and some
will be in an unknown status.
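
The four notations and the transitions among them can be read as a small state machine. The following Python sketch is only an illustration of that reading, not the tutor's actual rule set: the names Level and update and the two thresholds are assumptions introduced here, since the text says only that "further" and "recurrent" correct performances are required.

```python
from enum import Enum


class Level(Enum):
    UNLEARNED = 0
    PERHAPS = 1
    PROBABLY = 2
    STRONG = 3


# Hypothetical thresholds (not given in the text): length of the run of
# correct performances needed to reach each level.
PROBABLY_AFTER = 2
STRONG_AFTER = 4


def update(level, streak, correct):
    """Return (new_level, new_streak) after one observed performance."""
    if not correct:
        # A failure in the unstable PERHAPS state is read as evidence that
        # the earlier success was accidental; other states only lose the run.
        if level is Level.PERHAPS:
            return Level.UNLEARNED, 0
        return level, 0
    streak += 1
    if level is Level.UNLEARNED:
        return Level.PERHAPS, streak
    if level is Level.PERHAPS and streak >= PROBABLY_AFTER:
        return Level.PROBABLY, streak
    if level is Level.PROBABLY and streak >= STRONG_AFTER:
        return Level.STRONG, streak
    return level, streak
```

A richer version, as the text suggests, could add notations for specific misconceptions; that is left out of this sketch.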

If we consider how to diagnose student progress in a holistic practice environment given a
current student model state, we see that a first issue to be addressed is what to test. In principle, the
student could have learned anything since we last tested him or her. For that matter, any prior
demonstration of competence might have been a fluke, so all positive entries in the student model are
tentative. Nonetheless, it would make no use of the student model at all if we merely tested for every
skill component at every opportunity. The student model enables testing for selected skill components
efficiently and in realistic performance contexts. It is the equivalent for steering testing of the
patient's chart for medical diagnosis.

We  want  to  use  the  student  model  to  generate  constraints  on  the  problems  we  pose  to  the  student 
as  test  items.  These  constraints  should  have  the  property  that  they  make  the  items  maximally 
informative  in  tuning  the  student  model  to  changes  in  the  student’s  capabilities.  What  can  guide  our 
choices  of  curricular  goals  to  test?  There  are  several  possibilities.  We  discuss  them  in  terms  of  the 
four-level model of acquisition mentioned above (Unlearned, Perhaps learned, Probably learned, and
Strong). The Perhaps stage may be the most volatile. Suppose a curricular goal to be the attainment
of a specific production (carrying out a particular action when appropriate). When the action is
initially  performed  and  is  successful,  there  is  a  considerable  chance  that  the  student  may  not  notice 
the  most  important  cues  about  the  circumstance  of  the  moment.  So,  he/she  may  be  unable  to 
demonstrate  the  production  in  other  circumstances.  For  all  practical  purposes,  it  was  never  really 
learned  at  all.  Till  we  have  several  demonstrations  of  the  attainment  of  a  curricular  goal,  we  must 
assume  that  our  assessment  of  the  student  is  unstable.  Once  we  see  multiple  successful  performances, 
we  will  reclassify  the  student's  competence  to  the  Probably  level.  So,  a  first  principle  in  selecting 
current curricular goals to test is to be sure to check upon goals in the Perhaps state.

A second issue has to do with prerequisite skills. If Skill A depends upon Skill B, then there is no
point in regularly testing for A until B is demonstrated. Put another way, if there is ordering
information about the curricular goals, we may want to concentrate testing on the region in the
ordering between the goals in the Strong state and those in the Unlearned state, testing most often the
Perhaps goals, checking for progress on the next few Unlearned goals, and checking occasionally to see
if any goals have gone from Probably to Strong (operationally, we check to see if problems requiring
this subgoal's skills are answered correctly for several consecutive occasions with varying
requirements).

The  next  issue  involves  metacurricular  concerns,  especially  those  relating  to  extraneous  sources 
of  difficulty,  such  as  requiring  complicated  arithmetic  performance,  presenting  information  in  a 
medium  known  to  be  difficult  for  the  student,  etc.  The  basic  rule  of  thumb  we  propose  is  to  adapt  these 
difficulty  variables  to  the  current  student  model  level.  For  example,  if  the  goal  is  to  detect  a 
movement  from  Unlearned  to  Perhaps  for  some  curricular  goal,  then  we  want  to  set  the  metacurricular 
difficulty  levels  low,  so  that  the  initial  weak  acquisition  of  that  subgoal’s  knowledge  is  not  masked  by 
too many other demands for processing capacity. For movement from Perhaps to Probably, an
appropriate  problem  constraint  is  to  have  some  situational  changes  from  the  problem  in  which  the 
initial  appearance  of  the  relevant  knowledge  was  first  noted,  since  the  theoretical  motivation  for  the 
distinction  is  the  possibility  of  the  correct  actions  having  been  linked  to  imprecise  conditions.  For 
validating  movement  to  Strong  on  some  goal,  there  should  be  a  demonstration  of  the  relevant 
capability  under  more  difficult  circumstances,  since  the  question  is  whether  the  relevant  knowledge  is 
robust  enough  to  occur  even  under  adverse  conditions. 
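
The rule of thumb amounts to a small lookup from the current student model level to problem settings. The table below is a sketch under assumptions: the key names and the "moderate" setting for the Perhaps level are hypothetical, since the text specifies only low difficulty for detecting first acquisition, situational change for Perhaps to Probably, and harder circumstances for validating Strong.

```python
# Hypothetical settings; the text fixes only the direction of each adjustment.
PROBE_SETTINGS = {
    "unlearned": {"extraneous_difficulty": "low", "vary_situation": False},
    "perhaps": {"extraneous_difficulty": "moderate", "vary_situation": True},
    "probably": {"extraneous_difficulty": "high", "vary_situation": True},
}


def metacurricular_constraints(level):
    """Return problem constraints for probing a subgoal at the given level."""
    # Copy so callers cannot modify the shared table by accident.
    return dict(PROBE_SETTINGS.get(level, {}))
```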

The Concept of Constraint Posting

The  basic  approach  is  to  begin  each  cycle  of  diagnosis  by  sweeping  through  the  curricular  goal 
structure,  noting  which  subskills  are  "ripe"  for  testing.  When  the  sweep  is  completed,  we  try  to  build 
one  or  more  problems  that  maximize  our  chances  for  accurately  noting  changes  in  the  student's 
current knowledge state, using some of the rules of thumb just described. We then use performance on
these made-to-order problems to decide how to update the student model; that is, we make a diagnosis.

Critical to the approach is the concept of constraint posting (Stefik, 1980). Rather than building
test items as we sweep through the curricular goal structure, we instead simply add to a list of item
constraints as we proceed. Each time we see an issue on which we would like more clarity, we post
that concern as a constraint on the test item generation process. When the sweep through the
curriculum is complete, we take the bundle of constraints and try to build items that satisfy them.
Stefik (1980) has shown that in many complex problem solving tasks involving multiple sources of
complexity and interactions between problem aspects (e.g., designing recombinant DNA experiments),
this constraint posting approach is much more efficient than piecemeal search processes.

Constraint  Posting  Applied  to  Problem  Generation 

The item generation process, then, can work as follows. We first consider the student model.
Some of the subskills may be marked as reliably strong. These represent beachheads in the conquest
of ignorance. From these beachheads, as we venture out toward related subskills, we find some whose
status is uncertain (subskills that may or may not have been acquired yet and acquired subskills that
may or may not be reliable yet). We can make this search process more efficient if we know, for some
subgoals, which other subgoals are prerequisite to them and which they are prerequisite for. A
subgoal for which a just-attained subgoal is prerequisite is likely to be a testing target, but we will also
give some weight to all subgoals, using the rules of thumb discussed above. Since we are making
steering decisions, we focus on the area of the curriculum that is currently the object of instruction.
For each subgoal that is a current target of testing, at least one constraint is posted: a test problem
must address that subgoal. For example, if we want to find out whether the student’s capabilities in
applying  Ohm's  Law  to  series  circuits  have  improved,  we  post  constraints  that  the  problem  must 
require  Ohm’s  Law  and  must  involve  a  series  circuit. 

We  must  also  consider  metacurricular  planning  issues.  For  example,  a  part  of  the  system's 
planning  component  may  address  the  question  of  whether  or  not  a  physics  student  has  adequate  math 
facility, or whether or not a student is able to learn information from graphical presentations.
Constraints can be posted based on metacurricular aspects of the student model, too. We may,
essentially, say to the test generator, "Since this student is poor in arithmetic, I can't find out if he has
learned (moved from unlearned to perhaps) how to use Ohm’s Law to compute the current in a circuit if
the arithmetic comes out messy, so make the numbers come out simple."
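
The sweep described above can be sketched as a pass that only collects constraints and builds nothing. This is a minimal illustration, not the tutor's code: the function name sweep, the (attribute, value) encoding of a constraint, and the choice to probe only subgoals marked perhaps or unlearned are assumptions made for the example.

```python
def sweep(student_model, poor_arithmetic=False):
    """Collect item constraints; no problem is built during the sweep itself.

    student_model maps (law, topology) subgoals to a level name.
    """
    posted = []
    for (law, topology), level in student_model.items():
        if level in ("perhaps", "unlearned"):   # assumed testing targets
            posted.append(("law", law))
            posted.append(("topology", topology))
    if poor_arithmetic:
        # Metacurricular constraint: keep the numbers simple.
        posted.append(("arithmetic", "simple"))
    # Remove duplicate postings while keeping the order of first appearance.
    return list(dict.fromkeys(posted))


constraints = sweep({("Ohm", "series"): "perhaps"}, poor_arithmetic=True)
```

Here constraints holds the two curricular postings for Ohm's Law on a series circuit followed by the arithmetic posting; a separate generator would then try to build a problem satisfying all three.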

Once  the  sweep  through  the  curricular  and  planning  structures  is  complete,  the  posted 
constraints must be analyzed before test items are generated. Are there too many to handle at once? If
so, we might partition them into several clusters. Are the constraints inconsistent, in the sense that a
problem embodying some of them cannot, in principle, embody the others? For example, if we
constrain an electricity problem to be simple and we want to know both whether a student knows how
to deal with two resistors in series and also whether he knows how to deal with two in parallel, this
cannot all be done with one circuit problem. So, again, we might partition the constraints into bundles
that can comfortably be handled.
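
A simple way to picture the partitioning step is a greedy pass that opens a new bundle whenever a constraint clashes with every existing one. The sketch below rests on a simplifying assumption that is not in the text: two constraints are treated as inconsistent exactly when they give different values to the same attribute. The function name partition and the shared_keys parameter are likewise hypothetical.

```python
def partition(constraints, shared_keys=("complexity", "arithmetic")):
    """Split (attribute, value) constraints into mutually consistent bundles.

    Constraints on shared_keys apply to every bundle; if the same shared key
    is posted twice, the later value wins.
    """
    shared = {k: v for k, v in constraints if k in shared_keys}
    bundles = []
    for key, value in constraints:
        if key in shared_keys:
            continue
        for bundle in bundles:
            # The constraint fits if the bundle has no value for this
            # attribute yet, or already has the same value.
            if bundle.get(key, value) == value:
                bundle[key] = value
                break
        else:
            bundles.append({key: value})
    return [{**shared, **bundle} for bundle in bundles]
```

For the example in the text, posting a simple circuit together with both a series and a parallel target yields two bundles, each of which would be turned into its own problem.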

Finally, one or more holistic problems that satisfy the constraints posted must be posed. From
performance on a problem, either a diagnosis can be made immediately or a more focused problem can
be specified for further testing. In essence, we are dealing with a qualitative process that has many of
the properties of one of psychometrics' most important quantitative processes: adaptive testing.

An  Example  from  a  Tutor  for  Basic  Electricity  Principles 

To illustrate some of these ideas, we describe MHO, a tutor that teaches basic electrical
principles (current, voltage, and resistance; Kirchhoff's Laws and Ohm's Law). MHO is designed to
work in both a problem-posing and an exploration mode. In the exploratory mode, the student can
make measurements on circuits and even build his own circuit. In the didactic mode, though, MHO
must decide what problem to present to the student. Thus, it faces the same problem that a testing
program would face: to examine the student model and determine which problem to pose to optimize
the information value of the student's answer.

MHO's student model is a specialized form of checklist: a goal structure for teaching the specific
knowledge  it  wants  to  teach.  The  checklist  derives  from  the  curriculum  and  planning  issues  shown  in 
Figure  3  above.  For  each  subgoal,  the  student  is  marked  as  being  in  one  of  the  four  states  described 
above,  as  shown  in  Table  1.  Quantitative  scores  could  be  entered  as  well.  What  is  critical  is  that  some 
student knowledge levels are considered to indicate potential for change while others are not. For
example,  a  student  who  knows  certain  material  is  not  likely  to  suddenly  stop  knowing  it,  but  a  student 
who  has  yet  to  learn  some  material  is  in  a  more  changeable  state. 

From  the  subgoal  scores  and  other  knowledge,  such  as  curricular  sequencing  and  prerequisite 
relationships,  it  is  possible  to  define  a  set  of  subgoals  that  are  most  unstable.  These  are  the  subgoals 
that may require more frequent measurement in order for instruction to be steered well. As discussed
above,  they  represent  the  front  along  which  instruction  is  progressing  through  the  curriculum  goal 
structure.  The  task  of  a  test  item  generator,  then,  is  to  generate  a  test  item  that  will  be  especially 
informative about this front. MHO does this by posting a set of constraints for the test problem. In the
student model given above, the Series, Kirchhoff's Law, and Current subgoals are at this front. Each
constraint helps adapt the steering feedback to the student's current state. To see how this is done, we
need to consider MHO's architecture and the subject matter that it teaches and tests.

Architecture 

At  this  time,  MHO  teaches  and  tests  several  levels  of  DC  circuits.  It  poses  problems  such  as  the 
one  shown  in  Figure  4.  We  call  the  architecture  used  in  MHO  the  Bite-Size  Architecture.  It  is  an 
object-oriented architecture for intelligent tutoring systems (see Footnote 3). An object is a semiautonomous piece of
computer program that can be called upon to achieve particular goals. It includes both data structures
and procedural capabilities. Object-oriented programming involves designing sets of objects that can
efficiently interact to solve problems. Each curriculum subgoal (and also each metacurricular
planning  issue  and  each  problem  format)  is  represented  by  an  object  called  a  "bite."  Within  the 
computer  program,  a  bite  contains  a  record  of  the  student's  performance  on  a  subgoal  and  the 
knowledge  needed  to  post  a  constraint  for  that  subgoal. 

Voltage, for example, is represented by a bite in MHO. That bite has rules for teaching about
voltage. It contains information pertinent to developing an understanding of what voltage represents,
including the constraints it should post to create relevant problems. Also, it can update the student
model information by noting how the student does on problems relevant to its subgoal. One byproduct
of this architecture and the curricular model on which it is based is that a tutoring program's
knowledge is modular and can easily be expanded by adding additional curricular objects along with
their  pointers  to  the  other  knowledge  components  (which  may  involve  additions  to  those  components 
as  well).  For  example,  MHO's  designers  are  now  expanding  it  to  include  curricular  goals  involving 
simple  alternating  current  circuits. 
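
The idea of a bite can be illustrated with a small class. MHO itself is written in Loops (see Footnote 3); the Python below is only a sketch of the object's two roles, keeping a performance record and posting a constraint. The class layout, the method names, and the rule that a subgoal stops being tested after three consecutive successes are assumptions introduced for this example.

```python
class Bite:
    """One curricular subgoal: its performance record and its constraint."""

    def __init__(self, name, constraint):
        self.name = name
        self.constraint = constraint   # e.g. ("focus", "voltage")
        self.history = []              # True/False for each relevant problem

    def record(self, correct):
        """Update the student model entry after a relevant problem."""
        self.history.append(bool(correct))

    def wants_testing(self):
        # Hypothetical rule: keep testing until the last three relevant
        # problems were all answered correctly.
        return self.history[-3:] != [True, True, True]

    def post(self, posted):
        """Add this bite's constraint to the shared list if testing is due."""
        if self.wants_testing():
            posted.append(self.constraint)


voltage = Bite("Voltage", ("focus", "voltage"))
posted = []
voltage.post(posted)
```

Because each bite carries its own posting knowledge, adding a new curricular goal in this sketch means adding one more object rather than changing the sweep.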

Problem  Generation 

MHO poses problems by presenting a circuit diagram and asking a question about it. The
machinery used in problem generation chooses most of the circuit components randomly, but it is
constrained by both general and specific curricular subgoals (bites) which the student has not yet
mastered. Some of the choices represented by these constraints are the following:

a. A problem can be posed in qualitative, quantitative, or relative form.

b. The problem can vary in the complexity of the arithmetic it requires and the complexity of
the circuit diagram to which it refers. This is determined by a global assessment of how much of the
curriculum the student has mastered.

c. The problem can require knowledge of Ohm's Law or either of Kirchhoff's Laws.

d. The problem can focus on voltage, current, or resistance.

e. The problem can focus on series or parallel circuit topologies (MHO also worries about
where  the  meters  are  placed  in  circuit  diagrams,  since  there  are  some  placements  that  students  have 
particular  difficulty  handling,  but  we  ignore  that  matter  to  make  presentation  of  the  basic  approach 
more  straightforward). 

The product of constraint posting is stored as a list structure (see Footnote 3) to be used as the
basis for problem generation and problem solving. This list structure contains information that
specifies how to create a circuit and a problem based on that circuit, what the circuit should look like,
and what electronic concepts are relevant. An example of such a list, derived from the student model
shown in Table 1, is:

[1] ((((Rel Simple) ($ Kirchhoff)) ($ I = Series)) (UninterruptedS)) Series).

This  list  represents  the  constraints  that  have  been  posted  in  sweeping  the  model  shown  in  Table 
1  and  is  the  starting  point  for  automatic  generation  of  a  problem.  Rel  stands  for  a  Relative  problem 
that will pose a simple question asking if two areas of the circuit will have the same measurement (in
this case, current). Simple specifies the student's level of general understanding and will cause the
circuit to be very simple in structure. Kirchhoff is the law this problem centers around. I = Series is a
specialization of Kirchhoff's law, that current is equal at all points in a series array. UninterruptedS
informs the problem generator that one meter should appear next to another with no other
components between them (this is the simplest form for a problem looking at Kirchhoff's Law).
Constrained  by  this  information  the  problem  generator  can  develop  many  different  circuits  and  pose 
many  different  problems  about  them,  so  it  is  quite  plausible  to  do  as  much  steering  testing  as  any 
student  requires  and  also  to  give  students  sets  of  appropriate  problems  as  homework. 

At the next, more elaborated, level of representation the circuit is designated as a network of
resistors, a combination of series and parallel subnets with a power source. A more detailed list breaks
this circuit into four nodes, each of which represents a side of a rectangular circuit. The nodes are
created separately and then put together to make up a circuit. One at a time, the nodes are passed into
a recursive function called MakeCircuitString to be elaborated further. MakeCircuitString makes
decisions such as how many resistors are placed on a node, and whether these resistors should appear
in a parallel or series net. These decisions are based on the information from the first list.

Simple instructs MakeCircuitString to limit the number of resistors that appear and to
otherwise make the circuit conform to the specifications of a simple circuit. The Simple specifications
keep the components that will be drawn to a minimum. Simple also informs MakeCircuitString that,
depending on what net we are working with, all nodes should be of this kind. I = Series specifies the
net to be used: all sides are series arrays. If this were a Difficult problem, some sides might have
parallel subnets and others series. An example of a simple circuit, [1], that has passed through
MakeCircuitString is

[2] ((VoltageSource) (Series (Resistor) (Resistor)) (Parallel (Resistor) (Resistor)) (Wire)).

Figure 5 below shows the circuit designated by [2].
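
The elaboration step can be pictured with a short recursive generator that produces nested lists of the same general shape as the list above. This is not MakeCircuitString itself, whose rules are not given here; the function names make_net and make_circuit, the resistor limits, the nesting depth, and the fixed layout of a source, two nets, and a wire are all assumptions made for illustration.

```python
import random


def make_net(kind, max_resistors, rng, depth=0):
    """Recursively build a net such as ['Series', ['Resistor'], ['Resistor']]."""
    parts = []
    for _ in range(rng.randint(1, max_resistors)):
        # Only non-simple circuits (depth > 0) may nest a subnet of the
        # other kind inside this one.
        if depth > 0 and rng.random() < 0.5:
            other = "Parallel" if kind == "Series" else "Series"
            parts.append(make_net(other, max_resistors, rng, depth - 1))
        else:
            parts.append(["Resistor"])
    return [kind, *parts]


def make_circuit(kind, simple=True, seed=0):
    """Return four sides: a source, two nets of the given kind, and a wire."""
    rng = random.Random(seed)
    # Hypothetical limits: simple circuits get few resistors and no nesting.
    max_resistors, depth = (2, 0) if simple else (3, 1)
    nets = [make_net(kind, max_resistors, rng, depth) for _ in range(2)]
    return [["VoltageSource"], *nets, ["Wire"]]


circuit = make_circuit("Series", simple=True)
```

With simple=True every net is a flat array of one or two resistors of the requested kind, which mirrors the effect the text attributes to the Simple constraint; different seeds give different circuits that satisfy the same constraints.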

The final specifications development step is determining what problem should be posed about the
circuit, where meters should be placed, and what question should be asked about them. This step
requires some information from the first list, e.g., [1]. I = Series reveals whether current or voltage is
the target concept, while UninterruptedS holds information pertaining to how many problems and
where problems should appear. Several recursive functions tear apart the second list and insert
problem information (mainly meters) where it is best suited. Using the above example and placing
several meters into the list, one example of the next stage is

[3] ((Problem Rel current after on (VoltageSource)) (Series (Resistor) (Problem Rel current before
off (Resistor))) (Parallel (Problem Rel current after on (Resistor)) (Resistor)) (Wire))

This  list  is  then  passed  to  an  intelligent  problem  developer,  which  composes  and  draws  the 
circuit. Figure 6 below shows a display corresponding to [3]. The question posed to the student will
end up being, "Is the current at Meter A higher, lower, or the same as the current at Meter B?"

The Simulator assigns values to the components, i.e., resistance and voltage, and then finds the
dependent values, i.e., current, voltage drops over resistors, etc. It can, for simple problems, ensure
that all the values for current and voltage will be integral, and also can determine whether or not
resistors and voltage sources should be displayed. If the circuit were more complex, an iterative
propagation would occur next. Resistance for a subnet of a complex circuit, for example, would be
calculated by asking each subnet component its resistance and then adding them together. Parallel
structures are handled recursively as well, using the appropriate formulae.
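
The recursive resistance calculation can be sketched as follows, using the standard rules that series resistances add and parallel conductances add. The nested-list encoding (with each resistor carrying its value in ohms) and the function names are assumptions for this example; exact fractions are used so that the integrality check mentioned above is not disturbed by rounding.

```python
from fractions import Fraction


def resistance(net):
    """Equivalent resistance of a nested series/parallel net, in ohms."""
    kind, *parts = net
    if kind == "Resistor":
        return Fraction(parts[0])
    values = [resistance(part) for part in parts]
    if kind == "Series":
        return sum(values)                     # R = R1 + R2 + ...
    if kind == "Parallel":
        return 1 / sum(1 / v for v in values)  # 1/R = 1/R1 + 1/R2 + ...
    raise ValueError(f"unknown net type: {kind}")


def current_is_integral(voltage, net):
    """True when the source current V / R comes out as a whole number."""
    return (Fraction(voltage) / resistance(net)).denominator == 1


net = ["Series", ["Resistor", 2], ["Parallel", ["Resistor", 6], ["Resistor", 3]]]
```

For this net the parallel pair reduces to 2 ohms, the whole net to 4 ohms, and a 12 volt source gives a current of 3 amperes, so a generator could accept these values for a student who needs simple arithmetic and reject, say, a 10 volt source.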

The  Softness  of  Student  Classifications 

We  conclude  by  reconsidering  more  broadly  the  issue  of  diagnostic  assessment  of  cognitive  skills 
to steer instruction. Fundamentally, cognitive skill, like physical skill, often requires substantial
practice of its basic components in the contexts in which they are to be applied. Actions can be learned
without  learning  the  exact  conditions  for  which  they  are  appropriate.  Newly  learned,  and 
consequently  weak,  knowledge  can  fail  to  be  used  because  stronger  but  incorrect  knowledge  is 
overgeneralized  from  related  situations.  Processing  capacity  demands  due  to  one  subskill  may  be  so 
great  as  to  make  the  execution  of  another,  newly  formed  subskill  impossible.  This  means  that  for  most 
of  the  course  of  learning,  a  fundamental  principle  is  true: 

One  cannot  be  sure  a  subskill  has  not  been  learned  just  because  it 
was  not  demonstrated  on  an  occasion  where  it  should  have  been. 

On  the  other  hand,  cognitive  skill,  like  physical  skill,  is  partly  redundant.  Weak  methods  can 
sometimes  overcome  the  lack  of  appropriate  domain  knowledge.  Sometimes,  a  problem  that  in  theory 
should  require  a  particular  subskill  is  solved  correctly  by  accident.  The  correct  action  may  be  taken 
with  incorrect  knowledge  of  the  conditions  under  which  it  is  appropriate,  or  an  incorrect  action  may 
turn  out  to  be  "safe"  this  time  only.  This  leads  to  a  second  fundamental  principle. 

One  cannot  be  sure  a  subskill  is  completely  learned  just  because  it 
has  been  demonstrated. 

These  two  principles  suggest  that  the  steering  approach  to  diagnostic  testing,  in  which  local 
microtesting  is  embedded  in  the  curriculum  to  steer  instruction,  is  a  more  valid  approach  than  the 
broader diagnostic testing that has become part of many current monitoring programs in our schools.
By asking broad, generic questions (e.g., "What can I diagnose knowing nothing about the student in
advance and giving only a general test?") we can get only broad, generic answers. That is, we can
know  how  well,  in  general,  learning  is  proceeding,  but  we  can't  steer  specific  children's  education  with 
such  broad  indicators,  any  more  than  we  could  steer  a  ship  if  all  we  had  was  an  hourly  account  of  how 
close  to  the  correct  path  we  were. 

Empirical  experience  and  cognitive  theory  tell  us  that  an  inherent  property  of  cognitive 
performance  is  that  it  is  unreliable  unless  substantial  practice  has  occurred  and  that  success  can  come 
for  multiple  reasons.  These  factors  have  to  be  taken  into  account  in  diagnosis.  Ironically,  perhaps,  the 
less  reliable  steering  testing  approach  provides  better  steering  capability  than  the  highly  refined 
approaches  used  in  current  psychometric  efforts  at  diagnosis.  But  this  is  no  different  than  the  irony 
that  continuously  knowing  approximately  where  you  are  affords  better  steering  capability  than 
occasionally knowing how well you are steering, in general.

The  field  of  testing  has  worked  to  try  to  become  efficient  at  making  precise  estimates  from 
inherently unreliable data, and it has done very well at this. Approaches such as item-response theory
and adaptive testing have allowed the broad and vague measures that tests provide to be made ever
more efficiently. Further progress, and especially progress in steering testing (as opposed to
certification and selection testing), will depend on better use of information we already have, or can
readily get, about the cognitive requirements of the performances and student competences relative to
those performances that interest us. Like the physician, we will, in steering the course of a child's
education,  be  better  guided  by  sketchy  data  tied  to  specific  theoretical  analysis  than  by  precise,  but 
general,  indicators. 

Our  approach  can  be  contrasted  to  the  steering  forms  used  in  the  curricula  that  grew  from  Bob 
Glaser's work on individualized instruction. There, the steering idea was also used. However, the
technology of the time did not permit more than a short, uniform mastery test after each lesson. This
allowed adequate teaching of the higher-aptitude student but did not handle the remediation problem
discussed above. That is, it suffered from having to treat each curricular goal and its corresponding
student capability as separable from every other, and it could not handle the problem of core learning
without fringe transfer. There was much discussion during the period of that curriculum development
about having remediation that was more than just doing the same thing again. The present approach
to steering testing, which permits adaptation grounded in cognitive analysis of the instructional
domain, rests on the goal structure for educational research established during the period of work by
Bob Glaser and his colleagues on individually-prescribed instruction.

1In Lesgold (in press), a three-category model was presented. Since then, we have become
convinced that the curriculum and treatment categories should be separated.

2This issue is addressed more completely in Lesgold (in press).

3See Bonar, Cunningham, and Schultz (1986) for a description of An Object-Oriented Architecture
for Intelligent Tutoring Systems. MHO is implemented in Loops, Xerox's proprietary object-oriented
specialization of the standard artificial intelligence language Lisp. The graphics and student interface
are handled via an interface package called Chips. Chips is a program developed at the Learning
Research and Development Center, primarily by John D. Corbett and Robert F. Cunningham, with
some contribution by Andrew D. Bowen. The Chips tools allow circuit displays to be designed so the
student can click the mouse (a mouse is a pointing device that causes a marker to move on the screen
as the device is moved on a table top; it often contains buttons as well, so that the computer user can
point to an object on the screen by moving the marker over that object and then pressing a button) on
any of the components and thereby cause a menu of query options to appear. Each object can behave
differently: when a student clicks on a meter, a question is asked; when he/she clicks on a resistor, a
special menu of options is presented.

Figure 3. Examples of Different Knowledges Needed for Steering Testing